Toward indicative discussion fora summarization
Abstract
Summarization of electronic discussion fora is a unique challenge; techniques that work startlingly well on monolithic documents tend to fare poorly in this informal setting. Additionally, conventional techniques ignore much of the structure that could serve as valuable features in the summarization task. We present several novel examples of such features, including the catalyst score, which is effective at identifying salient messages without looking at their content. We also describe and evaluate NewsSum, a prototype summarization system that efficiently generates variable-length summaries of Usenet threads.
Similar resources
MultiLing 2015: Multilingual Summarization of Single and Multi-Documents, On-line Fora, and Call-center Conversations
In this paper we present an overview of MultiLing 2015, a special session at SIGdial 2015. MultiLing is a community-driven initiative that pushes the state of the art in Automatic Summarization by providing data sets and fostering further research and development of summarization systems. There were in total 23 participants this year submitting their system outputs to one or more of the four task...
Applying Natural Language Generation to Indicative Summarization
Creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from a generation perspective, by first analyzing its required content via published guidelines and corpus analysis. We show how these summaries can be factored into a set of document features, and how an implement...
Concept Identification And Presentation In The Context Of Technical Text Summarization
We describe a method of text summarization that produces indicative-informative abstracts for technical papers. The abstracts are generated by a process of conceptual identification, topic extraction and re-generation. We have carried out an evaluation to assess indicativeness and text acceptability relying on human judgment. The results so far indicate good performance in both tasks when co...
Comparable Fora
As the title suggests, our paper deals with web discussion fora, whose content can be considered a special type of comparable corpora. We discuss the potential of this vast amount of data, now available on the World Wide Web for nearly every language, covering both general and common topics as well as the most obscure and specific ones. To illustrate our ideas, we propose a case study of ...
Concept Identification and Presentation in the Context of Technical Text Summarization
We describe a method of text summarization that produces indicative-informative abstracts for technical papers. The abstracts are generated by a process of conceptual identification, topic extraction and re-generation. We have carried out an evaluation to assess indicativeness and text acceptability relying on human judgment. The results so far indicate good performance in both tasks when compare...